64 research outputs found

    The top five essential modules that may contain multiple potential drug targets.

    <p>In the figure, a diamond node represents a hub protein, and a hexagon node represents a protein that is both a hub and a bottleneck in the high-confidence network. Larger nodes indicate essential proteins, and smaller nodes indicate non-essential proteins. The majority of module members are hubs and/or bottlenecks in the network, reflecting their essentiality. <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone-0041202-g002" target="_blank">Figures 2</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone-0041202-g003" target="_blank">3</a>, and <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone-0041202-g004" target="_blank">4</a> were drawn using Cytoscape <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041202#pone.0041202-Shannon2" target="_blank">[72]</a>.</p>

    A Statistical Framework for Improving Genomic Annotations of Prokaryotic Essential Genes

    <div><p>Large-scale systematic analysis of gene essentiality is an important step toward unraveling the complex relationship between genotypes and phenotypes. Such analysis cannot be accomplished without unbiased and accurate annotations of essential genes. In current genomic databases, most essential gene annotations are derived from whole-genome transposon mutagenesis (TM), the most frequently used experimental approach for determining essential genes in microorganisms under defined conditions. However, there are substantial systematic biases associated with TM experiments. In this study, we developed a novel Poisson model–based statistical framework to simulate the TM insertion process and subsequently correct the experimental biases. We first quantitatively assessed the effects of major factors that potentially influence the accuracy of TM and then incorporated the relevant factors into the framework. By iteratively optimizing parameters, we inferred the insertion events that actually occurred and described each gene’s essentiality as a probability. Evaluated against the definitive mapping of the essential gene profile in <i>Escherichia coli</i>, our model significantly improved the accuracy of the original TM datasets, resulting in more accurate annotations of essential genes. Our method also showed encouraging results in improving subsaturation-level TM datasets. To test our model’s broad applicability to other bacteria, we applied it to <i>Pseudomonas aeruginosa PAO1</i> and <i>Francisella tularensis novicida</i> TM datasets. We validated our predictions against the literature and by allelic exchange experiments in <i>PAO1</i>. Our model was correct on six of the seven tested genes. Remarkably, in all three cases where our predictions contradicted the TM assignments, experimental validation supported our predictions. 
In summary, our method is a promising tool for improving genomic annotations of essential genes and enabling large-scale explorations of gene essentiality. Our contribution is timely considering the rapidly increasing number of essential gene sets. A web server has been set up to provide convenient access to this tool. All results and source code are available for download upon publication at <a href="http://research.cchmc.org/essentialgene/" target="_blank">http://research.cchmc.org/essentialgene/</a>.</p> </div>

    Prediction and Analysis of the Protein Interactome in <em>Pseudomonas aeruginosa</em> to Enable Network-Based Drug Target Selection

    <div><p><em>Pseudomonas aeruginosa</em> (<em>PA</em>) is a ubiquitous opportunistic pathogen that is capable of causing highly problematic, chronic infections in cystic fibrosis and chronic obstructive pulmonary disease patients. With the increased prevalence of multi-drug resistant <em>PA</em>, the conventional “one gene, one drug, one disease” paradigm is losing effectiveness. Network pharmacology, on the other hand, may hold the promise of discovering new drug targets to treat a variety of <em>PA</em> infections. However, despite the urgent need for novel drug target discovery, a <em>PA</em> protein-protein interaction (PPI) network of high accuracy and coverage has not yet been constructed. In this study, we predicted a genome-scale PPI network of <em>PA</em> by integrating various genomic features of <em>PA</em> proteins/genes using a machine learning-based approach. A total of 54,107 interactions covering 4,181 proteins in <em>PA</em> were predicted. A high-confidence network, which combines predicted high-confidence interactions, a reference set, and verified interactions and consists of 3,343 proteins and 19,416 potential interactions, was further assembled and analyzed. The predicted interactome network from this study is the first large-scale PPI network in <em>PA</em> with significant coverage and high accuracy. Subsequent analysis, including validation against existing small-scale PPI data and comparison of the network structure with that of other model organisms, supports the validity of the predicted PPI network. Potential drug targets were identified and prioritized based on their essentiality and topological importance in the high-confidence network. Host-pathogen protein interactions between human and <em>PA</em> were further extracted and analyzed. In addition, case studies were performed on protein interactions involving the anti-sigma factor MucA, the negative periplasmic alginate regulator MucB, and the transcriptional regulator RhlR. 
A web server to access the predicted PPI dataset is available at <a href="http://research.cchmc.org/PPIdatabase/">http://research.cchmc.org/PPIdatabase/</a>.</p> </div>
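
As a rough illustration of what "topological importance" can mean for an interaction network, the sketch below ranks proteins by degree and flags the best-connected ones as hubs. It is not the authors' procedure: the function name `find_hubs`, the 20% cutoff, and the protein labels are all hypothetical, and the abstract does not state which hub definition was used.

```python
from collections import defaultdict


def find_hubs(edges, top_fraction=0.2):
    """Return the nodes whose degree ranks in the top `top_fraction`.

    `edges` is an iterable of (protein_a, protein_b) pairs. The 20% default
    is an illustrative assumption, not a value taken from the study.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)  # sets make duplicate edges count once
        neighbors[b].add(a)
    if not neighbors:
        return set()
    # Sort by descending degree, breaking ties by name for determinism.
    ranked = sorted(neighbors, key=lambda n: (-len(neighbors[n]), n))
    n_hubs = max(1, int(len(ranked) * top_fraction))
    return set(ranked[:n_hubs])


# Hypothetical toy network: P1 interacts with every other protein.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P1", "P5"), ("P2", "P3")]
hubs = find_hubs(toy_edges)
```

A bottleneck measure (for example, betweenness centrality) could be combined with this degree ranking in the same way, but that choice is likewise an assumption here.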

    Illustration of the statistical model.

    <p>In a TM experiment, if a gene has no observed insertions, meaning it is TM essential (TmE), what could it really be? There are two possibilities. (1) Part A: It never had any insertion and was missed by all transposons by chance. In this case we have no useful information to infer the gene's status, and we are completely blind about it. For any blind gene, the best guess is to assume that the chance of the gene being essential equals the overall essential gene rate, Pr(overall essential), and that the chance of it being non-essential equals 1 - Pr(overall essential). (2) Part B: It actually had insertions, but all insertion mutants died. This means that the gene is truly essential. In this way, we can split the TM-assigned essential genes into two parts, TETmE and FETmE. Similarly, if a gene is observed to have insertions in the TM experiment, meaning it is TM non-essential, what could it really be? There are again two possibilities. (1) Part C: All the observed insertions are ineffective and did not interrupt the gene function. This means we are again blind about this gene, so it has a certain chance of being essential and a certain chance of being non-essential. (2) Part D: There was at least one effective insertion, and it did interrupt the gene function. This means the gene is truly non-essential.</p>
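
The Part A / Part B split for a gene with no observed insertions can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes insertions land uniformly at a hypothetical per-base rate `rate_per_bp`, so the insertion count in a gene of length `gene_len` is Poisson with mean `rate_per_bp * gene_len`, and it takes `pr_overall_essential` as a given prior. The function names are invented for this example.

```python
import math


def prob_missed_by_chance(gene_len, rate_per_bp):
    """Poisson probability that a gene receives zero insertions (Part A)."""
    return math.exp(-rate_per_bp * gene_len)


def prob_essential_given_no_insertions(gene_len, rate_per_bp, pr_overall_essential):
    """Probability that a gene with no observed insertions is truly essential."""
    p_blind = prob_missed_by_chance(gene_len, rate_per_bp)
    # Part A: the gene was never hit; fall back on the overall essential rate.
    blind_essential = p_blind * pr_overall_essential
    blind_nonessential = p_blind * (1.0 - pr_overall_essential)
    # Part B: the gene was hit but every mutant died, so it is essential.
    hit_and_essential = (1.0 - p_blind) * pr_overall_essential
    observed_no_insertions = blind_essential + blind_nonessential + hit_and_essential
    return (blind_essential + hit_and_essential) / observed_no_insertions


# Under these assumptions, a short gene is more likely to be missed by chance,
# so the absence of insertions is weaker evidence of essentiality.
short_gene = prob_essential_given_no_insertions(100, 0.005, 0.1)
long_gene = prob_essential_given_no_insertions(2000, 0.005, 0.1)
```

The sketch treats every insertion as effective; Parts C and D would additionally require a probability that an observed insertion fails to disrupt the gene.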

    Three factors have strong associations with false TM assignments.

    <p>(A) Gene length. TmEs are significantly shorter than genes in the PEC dataset and than genes overall. Many of these short genes may be false essential genes. (B) Position of insertions. Essential genes mistakenly assigned as non-essential by TM often have insertions in the extreme ends of the gene, which together cover 25% of its length (5% at the 5′ end and 20% at the 3′ end). These insertions do not completely disrupt a gene’s function. (C) Number of insertions. 75% of the essential genes mistakenly assigned as non-essential by TM have only one insertion.</p>
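
The position effect in panel (B) suggests a simple filter, sketched below: treat an insertion as potentially disruptive only if it falls outside the first 5% and the last 20% of the gene. The 5% and 20% values come from the caption, but the function name, the coordinate convention (position counted from the 5′ end, starting at 0), and the hard cutoff are assumptions made for illustration; the published model may weight position differently.

```python
def is_central_insertion(position, gene_len, five_prime_frac=0.05, three_prime_frac=0.20):
    """Return True if an insertion lies outside the extreme ends of a gene.

    `position` is the offset from the 5' end (0-based) and `gene_len` is the
    gene length in the same units. Insertions in the extreme ends are treated
    here as possibly ineffective, which is a simplifying assumption.
    """
    if gene_len <= 0 or not 0 <= position < gene_len:
        raise ValueError("position must lie within the gene")
    relative = position / gene_len
    return five_prime_frac <= relative <= 1.0 - three_prime_frac


# Hypothetical insertions in a 1,000 bp gene.
near_five_prime = is_central_insertion(10, 1000)   # within the first 5%
mid_gene = is_central_insertion(500, 1000)         # central region
near_three_prime = is_central_insertion(900, 1000) # within the last 20%
```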

    Improvement of overlaps with the PEC dataset using our model.


    A level-1 interaction map for MucA and MucB.

    <p>Each node is a protein and each edge is a predicted PPI from the high-confidence network (except the interaction MucA-AlgW, which comes from experimental PPI data). A total of 39 proteins and 199 interactions were captured by the level-1 PPI network for MucA and MucB. The 17 red nodes are essential proteins. Yellow edges indicate high-confidence interactions included in the high-confidence network.</p>

    Performance of the random forest classifier for the positive class in 10-fold cross-validation.


    Enrichment of true essential genes using different thresholds of the confidence score.


    Validation using allelic exchange experiments in <i>Pseudomonas aeruginosa PAO1</i>. E – Essential; N – Non-essential.
